Creating and Managing Datasets

Dataset Creation

Creating a dataset on Helix establishes a new data collection on the protocol. You create one by submitting a transaction that carries the required dataset parameters.

CLI Command

nuklaid tx dataset create-dataset <name> <ticker> <description> <url> <categories> <licenseName> <licenseSymbol> <licenseUrl> <isCommunityDataset> <metadata> --from=<key-name>

Example

nuklaid tx dataset create-dataset "Climate Data" "CLIM" "Global temperature readings" "https://example.com" "climate,science" "CC-BY" "CC" "https://creativecommons.org" true "{}" --from=bob

Transaction Flow

Under the hood, the creation process:

  1. Validates input parameters
  2. Generates a unique denom identifier
  3. Creates a dataset record in state
  4. Establishes an NFT class for future contributions
  5. Returns the generated denom to the client
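Step 1 is not visible in the keeper handler shown on this page, so the following is only a minimal sketch of what stateless input validation could look like. The `CreateDatasetInput` struct, the `Validate` method, and every rule in it (non-empty name, ticker length limit, parseable URLs, JSON metadata) are assumptions made for illustration, not the module's actual checks.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// CreateDatasetInput is a hypothetical struct mirroring some of the CLI arguments.
type CreateDatasetInput struct {
	Name       string
	Ticker     string
	URL        string
	LicenseURL string
	Metadata   string
}

// Validate applies illustrative stateless checks. All rules here are assumed.
func (in CreateDatasetInput) Validate() error {
	if strings.TrimSpace(in.Name) == "" {
		return errors.New("name must not be empty")
	}
	// Assumed limit: tickers are 1 to 10 bytes long.
	if n := len(in.Ticker); n == 0 || n > 10 {
		return fmt.Errorf("ticker length %d out of range", n)
	}
	for _, raw := range []string{in.URL, in.LicenseURL} {
		if _, err := url.ParseRequestURI(raw); err != nil {
			return fmt.Errorf("invalid URL %q: %w", raw, err)
		}
	}
	// Assumed rule: metadata must be a valid JSON document.
	if !json.Valid([]byte(in.Metadata)) {
		return errors.New("metadata must be valid JSON")
	}
	return nil
}

func main() {
	in := CreateDatasetInput{
		Name:       "Climate Data",
		Ticker:     "CLIM",
		URL:        "https://example.com",
		LicenseURL: "https://creativecommons.org",
		Metadata:   "{}",
	}
	fmt.Println("validation error:", in.Validate())
}
```

Rejecting malformed input before any state is written keeps failed transactions cheap and makes error messages easier to trace back to a specific argument.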

The core implementation in the keeper:

func (k msgServer) CreateDataset(goCtx context.Context, msg *types.MsgCreateDataset) (*types.MsgCreateDatasetResponse, error) {
	ctx := sdk.UnwrapSDKContext(goCtx)

	// Generate a unique denom for the dataset
	denom := generateDatasetDenom(msg.Owner, msg.Name, msg.Ticker, msg.Description, msg.Url, msg.Categories, msg.Metadata, msg.IsCommunityDataset)

	// Create dataset entry
	dataset := types.Dataset{
		Owner:              msg.Owner,
		Denom:              denom,
		Name:               msg.Name,
		Ticker:             msg.Ticker,
		Description:        msg.Description,
		Url:                msg.Url,
		Supply:             0,
		Categories:         msg.Categories,
		LicenseName:        msg.LicenseName,
		LicenseSymbol:      msg.LicenseSymbol,
		LicenseUrl:         msg.LicenseUrl,
		IsCommunityDataset: msg.IsCommunityDataset,
		Metadata:           msg.Metadata,
	}

	k.SetDataset(ctx, dataset)

	// Create NFT class for the dataset
	nftClass := nft.Class{
		Id:          denom,
		Name:        msg.Name,
		Symbol:      msg.Ticker,
		Description: msg.Description,
		Uri:         msg.Url,
		Data:        &sdkCodec.Any{Value: []byte(msg.Metadata)},
	}

	if err := k.nftKeeper.SaveClass(ctx, nftClass); err != nil {
		return nil, errorsmod.Wrap(err, "failed to create NFT class")
	}

	return &types.MsgCreateDatasetResponse{
		Denom: denom,
	}, nil
}
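The handler above calls `generateDatasetDenom`, but its body is not shown. One plausible way to derive an identifier from the same inputs is to hash them; the sketch below does that with SHA-256. The function name `sketchDatasetDenom`, the `dataset-` prefix, the length-prefixed encoding, and the 40-character truncation are all assumptions, and the real function may work differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// sketchDatasetDenom is a hypothetical stand-in for generateDatasetDenom.
// It hashes the same inputs the handler passes in, in the same order.
func sketchDatasetDenom(owner, name, ticker, description, url, categories, metadata string, isCommunity bool) string {
	h := sha256.New()
	fields := []string{owner, name, ticker, description, url, categories, metadata, strconv.FormatBool(isCommunity)}
	for _, f := range fields {
		// Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
		fmt.Fprintf(h, "%d:%s", len(f), f)
	}
	// The "dataset-" prefix and 40-character truncation are illustrative choices.
	return "dataset-" + hex.EncodeToString(h.Sum(nil))[:40]
}

func main() {
	denom := sketchDatasetDenom(
		"bob", "Climate Data", "CLIM", "Global temperature readings",
		"https://example.com", "climate,science", "{}", true,
	)
	fmt.Println(denom)
}
```

A purely content-derived identifier like this is deterministic: submitting identical inputs twice yields the same denom, so a scheme of this kind would need a duplicate check or an extra source of uniqueness (for example a counter or block height) to guarantee distinct datasets.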

Updating Datasets

Once created, datasets can be updated to modify metadata, licensing information, and other fields.

CLI Command

nuklaid tx dataset update-dataset <denom> <name> <description> <url> <categories> <licenseName> <licenseSymbol> <licenseUrl> <isCommunityDataset> <metadata> --from=<key-name>

Permissions

Only the dataset owner can update it:

// Ensure only the dataset owner can update
if msg.Owner != dataset.Owner {
	return nil, errorsmod.Wrap(sdkerrors.ErrUnauthorized, "incorrect owner")
}

Immutable Fields

While most fields can be updated, the denom and ticker fields are immutable after creation:

// Update dataset fields
dataset.Name = msg.Name
dataset.Description = msg.Description
dataset.Url = msg.Url
dataset.Categories = msg.Categories
dataset.LicenseName = msg.LicenseName
dataset.LicenseSymbol = msg.LicenseSymbol
dataset.LicenseUrl = msg.LicenseUrl
dataset.IsCommunityDataset = msg.IsCommunityDataset
dataset.Metadata = msg.Metadata

Note that neither the denom nor the ticker is assigned during an update, so both keep their initial values.

Querying Datasets

List All Datasets

nuklaid query dataset list-dataset

Get Specific Dataset

nuklaid query dataset show-dataset <denom>

Dataset Lifecycle Management

Initial State

Datasets start with zero contributions (Supply: 0). Their community status is set at creation time; the owner can change it later through update-dataset.

Working with Multiple Datasets

For applications managing multiple datasets, efficient querying is supported through:

  1. Pagination: List queries support pagination parameters
  2. Filtering: Client-side filtering by owner, categories, or other attributes
  3. Indexing: Fast lookups by denom
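As a rough illustration of the client-side part, the sketch below filters an already-fetched list by category, slices it into pages, and builds a denom index. The `Dataset` struct is a hypothetical subset of the real type, the helper names are invented, and treating categories as a comma-separated string is an assumption based on the CLI example ("climate,science").

```go
package main

import (
	"fmt"
	"strings"
)

// Dataset is a hypothetical subset of the fields returned by a list query.
type Dataset struct {
	Owner      string
	Denom      string
	Categories string // assumed to be comma-separated
}

// filterByCategory keeps datasets whose category list contains the given category.
func filterByCategory(all []Dataset, category string) []Dataset {
	var out []Dataset
	for _, ds := range all {
		for _, c := range strings.Split(ds.Categories, ",") {
			if strings.EqualFold(strings.TrimSpace(c), category) {
				out = append(out, ds)
				break
			}
		}
	}
	return out
}

// page returns up to limit items starting at offset, or nil if the range is empty.
func page(items []Dataset, offset, limit int) []Dataset {
	if offset < 0 || limit <= 0 || offset >= len(items) {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

// indexByDenom builds a map for constant-time lookups by denom.
func indexByDenom(all []Dataset) map[string]Dataset {
	idx := make(map[string]Dataset, len(all))
	for _, ds := range all {
		idx[ds.Denom] = ds
	}
	return idx
}

func main() {
	all := []Dataset{
		{Owner: "bob", Denom: "d1", Categories: "climate,science"},
		{Owner: "alice", Denom: "d2", Categories: "finance"},
		{Owner: "bob", Denom: "d3", Categories: "science, health"},
	}
	science := filterByCategory(all, "science")
	fmt.Println(len(science), page(science, 0, 1), indexByDenom(all)["d2"].Owner)
}
```

Filtering on the client means every candidate dataset has to be fetched first, so for large result sets it is usually worth combining this with the list query's pagination parameters rather than loading everything at once.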

Best Practices

  1. Unique Names: Use clear, unique names for datasets
  2. Rich Metadata: Provide comprehensive metadata so the community can discover the dataset more easily
  3. Appropriate Licensing: Choose a license that matches how you intend the data to be used and shared
  4. Community Settings: Enable community contributions when you want others to be able to add to the dataset